library(MASS)
library(biotools)
## ---
## biotools version 4.3
library(klaR)
library(car)
## Loading required package: carData
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
##
## recode
## The following object is masked from 'package:MASS':
##
## select
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(ggplot2)
library(ggExtra)
library(heplots)
## Loading required package: broom
## Warning in rgl.init(initValue, onlyNULL): RGL: unable to open X11 display
## Warning: 'rgl.init' failed, will use the null device.
## See '?rgl.useNULL' for ways to avoid this warning.
##
## Attaching package: 'heplots'
## The following object is masked from 'package:biotools':
##
## boxM
library(readr)
library(caret)
## Loading required package: lattice
library(vegan)
## Loading required package: permute
## Registered S3 methods overwritten by 'vegan':
## method from
## plot.rda klaR
## predict.rda klaR
## print.rda klaR
##
## Attaching package: 'vegan'
## The following object is masked from 'package:caret':
##
## tolerance
## The following object is masked from 'package:klaR':
##
## rda
The dataset from UCI Machine Learning Respository includes 244 instances that regroup a data of two regions of Algeria,namely the Bejaia region located in the northeast of Algeria and the Sidi Bel-abbes region located in the northwest of Algeria.
not fire/fire)df <- read_csv("algerian_forest_fires.csv", skip = 1)
## Rows: 244 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): DC, FWI, Classes, Region
## dbl (11): day, month, year, Temperature, RH, Ws, Rain, FFMC, DMC, ISI, BUI
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
df <- df[complete.cases(df), ]
df$day <- as.integer(df$day)
df$month <- factor(df$month, levels = 6:9,
labels = c("Jun","Jul","Aug","Sep"))
df$year <- as.factor(df$year)
df$DC <- as.numeric(df$DC)
df$FWI <- as.numeric(df$FWI)
df$Classes <- trimws(df$Classes)
df$Classes <- factor(df$Classes)
df$Region <- factor(df$Region)
df$sqWs <- sqrt(df$Ws + 1)
df$tRain <- 1 / ((df$Rain + 1)^3)
df$tFFMC <- (df$FFMC)^3
df$logDMC <- log(df$DMC + 1)
df$logDC <- log(df$DC + 1)
df$logFWI <- log(df$FWI + 1)
In this assignment:
Temperature, RH, and
Ws.Classes (fire vs. not fire),
Region (Bejaia vs. Sidi-Bel Abbes), and month
(Jun, Jul, Aug, Sep).Classes and RegionWe start by creating interaction plots for Temperature,
RH, and Ws, grouped by the first two
categorical predictors Classes and Region,
across all months.
resp.vars <- c("Temperature","RH", "Ws")
for (v in resp.vars) {
interaction.plot(
x.factor = df$Classes,
trace.factor = df$Region,
response = df[[v]],
type = 'b', lwd = 2,
trace.label = "Region",
lty = c(3, 1), col = c(4, 2), pch = c(16, 4),
xlab = " ",
ylab = paste("Mean", v),
main = paste("Interaction Plot for", v),
)
grid()
}
Observations:
In both regions, temperatures tend to be higher on fire days than on non-fire days. Sidi-Bel Abbes generally records higher temperatures than Bejaia. The lines are nearly parallel, suggesting little to none interaction between Classes and Region.
RH is lower on fire days, especially in Sidi-Bel Abbes. A possible interaction is observed: the RH difference between fire and not fire is greater in Sidi-Bel Abbes.
In Sidi-Bel Abbes, mean wind speed is slightly lower on non-fire days than on fire days. In Bejaia, it’s the opposite: wind speed is actually higher on non-fire days than on fire days. This seems a bit counter-intuitive: as it makes more sense for fire risk to increase with higher the wind speeds, the faster the fire spreads. This negative relationship might suggests that wind speed is not a major contributor to fire occurances in the region.
Out of curiosity, I made interaction plots for the response variables
grouped by all three categorical predictors Classes,
Region, and months, where Classes = fire is in
red, Classes = not fire in blue, Region = Bejaia is in dashed line, and
Region = Sidi-Bel Abbes is in solid line.
for (v in resp.vars) {
interaction.plot(
x.factor = df$month,
trace.factor = interaction(df$Classes, df$Region),
response = df[[v]],
type = 'b', lwd = 2,
trace.label = "Region-Class",
lty = c(3,3, 1,1), col = c(2, 4, 2, 4), pch = c(16, 16, 4, 4),
xlab = "Month",
ylab = paste("Mean", v),
main = paste("Interaction Plot for", v),
)
}
Observations:
Temperatures peak in August across all groups. Sidi-Bel Abbes seems to be more fire prone and generally records higher temperatures than Bejaia. For both regions, temperatures tend to be at least one degree higher on fire days than on non-fire days.
RH is generally inversely related to fire occurrence: fire groups (red) tend to have lower humidity, especially Sidi-Bel Abbes in August (~45%). Bejaia is more humid than Sidi-Bel Abbes. Bejaia—not fire shows the highest RH throughout (~75% in June). This pattern support the hypothesis that drier air contributes to fire events, especially during midsummer.
A possible outlier is observed: the RH is higher under fire conditions than non-fire conditions in Bajaia in August, whereas in all other region–class combinations and months, RH consistently drops by approximately 10% from not fire to fire,
options(contrasts = c("contr.sum","contr.poly"))
mod_mva <- lm(
cbind(Temperature, RH, Ws) ~ Classes *Region, data = df
)
summary(Anova(mod_mva, type = 3), univariate = TRUE)
##
## Type III MANOVA Tests:
##
## Sum of squares and products for error:
## Temperature RH Ws
## Temperature 2220.3684 -4706.159 -540.3526
## RH -4706.1593 36708.699 1654.7711
## Ws -540.3526 1654.771 1833.5755
##
## ------------------------------------------
##
## Term: (Intercept)
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 236427.3 466722.2 114607.2
## RH 466722.2 921338.8 226241.8
## Ws 114607.2 226241.8 55555.4
##
## Multivariate Tests: (Intercept)
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.99682 24784.08 3 237 < 2.22e-16 ***
## Wilks 1 0.00318 24784.08 3 237 < 2.22e-16 ***
## Hotelling-Lawley 1 313.72253 24784.08 3 237 < 2.22e-16 ***
## Roy 1 313.72253 24784.08 3 237 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Classes
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 733.06133 -2348.1411 -43.972028
## RH -2348.14115 7521.5628 140.851144
## Ws -43.97203 140.8511 2.637623
##
## Multivariate Tests: Classes
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.2755645 30.05043 3 237 < 2.22e-16 ***
## Wilks 1 0.7244355 30.05043 3 237 < 2.22e-16 ***
## Hotelling-Lawley 1 0.3803851 30.05043 3 237 < 2.22e-16 ***
## Roy 1 0.3803851 30.05043 3 237 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Region
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 105.25526 -758.3203 -81.55311
## RH -758.32027 5463.3815 587.55618
## Ws -81.55311 587.5562 63.18839
##
## Multivariate Tests: Region
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.1386396 12.71538 3 237 9.8157e-08 ***
## Wilks 1 0.8613604 12.71538 3 237 9.8157e-08 ***
## Hotelling-Lawley 1 0.1609542 12.71538 3 237 9.8157e-08 ***
## Roy 1 0.1609542 12.71538 3 237 9.8157e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Classes:Region
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 3.905114 -45.08995 7.16896
## RH -45.089947 520.62588 -82.77557
## Ws 7.168960 -82.77557 13.16069
##
## Multivariate Tests: Classes:Region
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0258281 2.094519 3 237 0.10162
## Wilks 1 0.9741719 2.094519 3 237 0.10162
## Hotelling-Lawley 1 0.0265129 2.094519 3 237 0.10162
## Roy 1 0.0265129 2.094519 3 237 0.10162
##
## Type III Sums of Squares
## df Temperature RH Ws
## (Intercept) 1 2.3643e+05 921338.84 55555.4047
## Classes 1 7.3306e+02 7521.56 2.6376
## Region 1 1.0526e+02 5463.38 63.1884
## Classes:Region 1 3.9051e+00 520.63 13.1607
## residuals 239 2.2204e+03 36708.70 1833.5755
##
## F-tests
## Temperature RH Ws
## (Intercept) 25448.98 5998.58 7241.45
## Classes 78.91 48.97 0.34
## Region 11.33 35.57 8.24
## Classes:Region 0.42 3.39 1.72
##
## p-values
## Temperature RH Ws
## (Intercept) < 2.22e-16 < 2.22e-16 < 2.22e-16
## Classes < 2.22e-16 2.6002e-11 0.55819465
## Region 0.00088866 8.7923e-09 0.00447376
## Classes:Region 0.51738702 0.06684634 0.19153751
1. Multivariate tests
All the tests show highly significant main effects for both
Classes and Region (p-values << 0.001),
i.e. fire vs non-fire differ multivariately, so do Sidi-Bel Abbes vs
Bejaia. However, there is no significant
Classes×Region interaction, i.e. there is no
strong evidence that the fire–non-fire difference itself changes between
regions.
2. Univariate follow-ups
Classes) is significantly associated
with higher temperature and lower relative humidity (p-value <<
0.01), but not significantly associated with wind speed (p =
0.558).Region significantly affects all three meteorological
variables, i.e. Bejaia and Sidi-Bel Abbes have different
climates/microclimates.cqplot(mod_mva$residuals, label = "Residuals MANOVA")
In the chi-square qq plot, most points track the line quite closely except a few pointing a little upwards towards the tail, but this is good enough for us to say that the multivariate‐normality assumption is satisfied.
Since the two-way MANOVA for Region x Classes is not statistically
significant, I decided to add month as a third factor. By doing so, I
wanted to ask if the shift in joint distribution of Temperature, RH and
Wind attributed to Classes and Region also
change over the seasonal cycle.
mod_mva3way <- lm(
cbind(Temperature, RH, Ws) ~ Classes * Region * month,
data = df
)
summary(Anova(mod_mva3way, type = 3), univariate = TRUE)
##
## Type III MANOVA Tests:
##
## Sum of squares and products for error:
## Temperature RH Ws
## Temperature 1431.0946 -3324.380 -586.9317
## RH -3324.3801 32901.958 1515.0685
## Ws -586.9317 1515.068 1726.7833
##
## ------------------------------------------
##
## Term: (Intercept)
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 190477.65 370048.4 90842.15
## RH 370048.44 718907.7 176482.64
## Ws 90842.15 176482.6 43324.23
##
## Multivariate Tests: (Intercept)
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.9973 27442.93 3 225 < 2.22e-16 ***
## Wilks 1 0.0027 27442.93 3 225 < 2.22e-16 ***
## Hotelling-Lawley 1 365.9057 27442.93 3 225 < 2.22e-16 ***
## Roy 1 365.9057 27442.93 3 225 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Classes
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 214.73606 -903.0670 -25.660381
## RH -903.06699 3797.8251 107.914075
## Ws -25.66038 107.9141 3.066346
##
## Multivariate Tests: Classes
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.1620981 14.50928 3 225 1.1302e-08 ***
## Wilks 1 0.8379019 14.50928 3 225 1.1302e-08 ***
## Hotelling-Lawley 1 0.1934571 14.50928 3 225 1.1302e-08 ***
## Roy 1 0.1934571 14.50928 3 225 1.1302e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Region
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 115.80863 -691.2048 -66.89517
## RH -691.20483 4125.4623 399.26442
## Ws -66.89517 399.2644 38.64102
##
## Multivariate Tests: Region
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.1256448 10.7775 3 225 1.2069e-06 ***
## Wilks 1 0.8743552 10.7775 3 225 1.2069e-06 ***
## Hotelling-Lawley 1 0.1437000 10.7775 3 225 1.2069e-06 ***
## Roy 1 0.1437000 10.7775 3 225 1.2069e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: month
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 633.62565 -989.70150 48.58209
## RH -989.70150 2083.23480 34.93185
## Ws 48.58209 34.93185 27.09281
##
## Multivariate Tests: month
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 3 0.3893691 11.28549 9 681.0000 < 2.22e-16 ***
## Wilks 3 0.6212120 13.14959 9 547.7415 < 2.22e-16 ***
## Hotelling-Lawley 3 0.5927269 14.73036 9 671.0000 < 2.22e-16 ***
## Roy 3 0.5624664 42.55996 3 227.0000 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Classes:Region
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 3.974519 -32.80782 9.197735
## RH -32.807819 270.81337 -75.923044
## Ws 9.197735 -75.92304 21.285170
##
## Multivariate Tests: Classes:Region
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0279156 2.153795 3 225 0.094333 .
## Wilks 1 0.9720844 2.153795 3 225 0.094333 .
## Hotelling-Lawley 1 0.0287173 2.153795 3 225 0.094333 .
## Roy 1 0.0287173 2.153795 3 225 0.094333 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Classes:month
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 14.309106 -66.40631 -8.703779
## RH -66.406313 350.08024 48.683197
## Ws -8.703779 48.68320 58.200541
##
## Multivariate Tests: Classes:month
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 3 0.0491600 1.260580 9 681.0000 0.255094
## Wilks 3 0.9513511 1.260010 9 547.7415 0.255978
## Hotelling-Lawley 3 0.0506000 1.257503 9 671.0000 0.256905
## Roy 3 0.0362476 2.742732 3 227.0000 0.043984 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Region:month
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 12.004969 -57.49608 -2.105804
## RH -57.496077 443.79159 49.422918
## Ws -2.105804 49.42292 9.843280
##
## Multivariate Tests: Region:month
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 3 0.0242358 0.6162591 9 681.0000 0.78376
## Wilks 3 0.9759051 0.6129820 9 547.7415 0.78643
## Hotelling-Lawley 3 0.0245454 0.6099987 9 671.0000 0.78910
## Roy 3 0.0149135 1.1284560 3 227.0000 0.33832
##
## ------------------------------------------
##
## Term: Classes:Region:month
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 6.713160 -25.30107 -6.758531
## RH -25.301071 242.46485 30.175037
## Ws -6.758531 30.17504 7.542430
##
## Multivariate Tests: Classes:Region:month
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 3 0.0130245 0.3299399 9 681.0000 0.96515
## Wilks 3 0.9870151 0.3277180 9 547.7415 0.96585
## Hotelling-Lawley 3 0.0131156 0.3259475 9 671.0000 0.96654
## Roy 3 0.0085968 0.6504900 3 227.0000 0.58340
##
## Type III Sums of Squares
## df Temperature RH Ws
## (Intercept) 1 1.9048e+05 718907.72 43324.2285
## Classes 1 2.1474e+02 3797.83 3.0663
## Region 1 1.1581e+02 4125.46 38.6410
## month 3 6.3363e+02 2083.23 27.0928
## Classes:Region 1 3.9745e+00 270.81 21.2852
## Classes:month 3 1.4309e+01 350.08 58.2005
## Region:month 3 1.2005e+01 443.79 9.8433
## Classes:Region:month 3 6.7132e+00 242.46 7.5424
## residuals 227 1.4311e+03 32901.96 1726.7833
##
## F-tests
## Temperature RH Ws
## (Intercept) 30213.53 4959.95 5695.33
## Classes 11.35 26.20 0.13
## Region 6.12 9.49 5.08
## month 100.51 14.37 1.19
## Classes:Region 0.63 0.62 0.93
## Classes:month 0.76 2.42 7.65
## Region:month 1.90 1.02 1.29
## Classes:Region:month 0.35 0.56 0.33
##
## p-values
## Temperature RH Ws
## (Intercept) < 2.22e-16 < 2.22e-16 < 2.22e-16
## Classes 5.7533e-07 6.5502e-07 0.93949273
## Region 0.00050652 6.2592e-06 0.02516277
## month < 2.22e-16 0.00019222 0.31539600
## Classes:Region 0.42802408 0.60092349 0.42562534
## Classes:month 0.51956923 0.12154775 0.00614145
## Region:month 0.16896363 0.38427470 0.25651488
## Classes:Region:month 0.78560528 0.64354283 0.80330136
1. Multivariate results
When testing Temperature, RH and Wind Speed jointly, all three main
effects (Classes, Region and
month) are highly significant (each \(p << 0.01\)), but none of the
two-or-three-way interactions reach significance on Pillai’s trace (all
\(p > 0.05\)). Roy’s largest-root
test for the fire×month term, however, is \(p
\approx 0.044\), indicating that there is at least one specific
linear combination of (T, RH, Ws) in which the fire×month interaction is
detectable.
Follow up univariately: in the univariate F‐tests, only Ws has a significant Classes×month effect (F=7.65, p≈0.006), which is the exact response driving Roy’s result.
2. Univariate follow-ups
Looking at each response by itself, Temperature and RH behave as expected: the three response variables each explain a highly significant portion of their variability, but none of their interactions do.
On the other hand, for wind speed, region remains significant (\(p \approx 0.025\)), month does not (\(p \approx 0.32\)), yet the Classes × month
F-test for wind has \(p \approx
0.006\), which is significant. In other words, in some months the
wind speed effect on fire days is larger compared to non-fire days.
Ws is the exact response driving Roy’s result on the
Classes×month effect in the multivariate results.
options(contrasts = c("contr.treatment", "contr.poly"))
contrasts(df$Region)
## Sidi-Bel Abbes
## Bejaia 0
## Sidi-Bel Abbes 1
contrasts(df$Classes)
## not fire
## fire 0
## not fire 1
Therefore, Region1 is 1 for Sidi‐Bel Abbes (0 for
Bejaia), and Classes1 is 1 for not‐fire days (0 for fire
days).
rownames(coef(mod_mva))
## [1] "(Intercept)" "Classes1" "Region1" "Classes1:Region1"
# Test 1
linearHypothesis(mod_mva, "Region1 = 0")
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 105.25526 -758.3203 -81.55311
## RH -758.32027 5463.3815 587.55618
## Ws -81.55311 587.5562 63.18839
##
## Sum of squares and products for error:
## Temperature RH Ws
## Temperature 2220.3684 -4706.159 -540.3526
## RH -4706.1593 36708.699 1654.7711
## Ws -540.3526 1654.771 1833.5755
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.1386396 12.71538 3 237 9.8157e-08 ***
## Wilks 1 0.8613604 12.71538 3 237 9.8157e-08 ***
## Hotelling-Lawley 1 0.1609542 12.71538 3 237 9.8157e-08 ***
## Roy 1 0.1609542 12.71538 3 237 9.8157e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This tests \(H_0: \mu_{B} =
\mu_{S}\). All four multivariate test statistics agree with \(p << 0.01\), so we reject \(H_0\) and conclude that the two regions
have significantly different joint means of
(Temperature, RH, Ws).
# Test 2
linearHypothesis(mod_mva, "Classes1 + 0.5*Classes1:Region1 = 0")
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 565.80576 -1633.5908 -74.566426
## RH -1633.59083 4716.4932 215.287715
## Ws -74.56643 215.2877 9.826962
##
## Sum of squares and products for error:
## Temperature RH Ws
## Temperature 2220.3684 -4706.159 -540.3526
## RH -4706.1593 36708.699 1654.7711
## Ws -540.3526 1654.771 1833.5755
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.2142208 21.53715 3 237 2.2685e-12 ***
## Wilks 1 0.7857792 21.53715 3 237 2.2685e-12 ***
## Hotelling-Lawley 1 0.2726221 21.53715 3 237 2.2685e-12 ***
## Roy 1 0.2726221 21.53715 3 237 2.2685e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This tests whether the average fire-effect across both regions is zero, i.e. \[H_0:\ \frac{(\mu_{fire,B} - \mu_{nonfire,B}) + (\mu_{fire,S} - \mu_{nonfire,S})}{2} = 0\]
The resulting p-values are << 0.01, so we reject \(H_0\). Even when averaging the fire vs. non-fire difference across regions, the effect remains highly significant.
# Test 3
linearHypothesis(mod_mva,"Classes1 + Classes1:Region1 = 0")
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 330.59345 -841.7929 -69.17514
## RH -841.79289 2143.4643 176.14124
## Ws -69.17514 176.1412 14.47458
##
## Sum of squares and products for error:
## Temperature RH Ws
## Temperature 2220.3684 -4706.159 -540.3526
## RH -4706.1593 36708.699 1654.7711
## Ws -540.3526 1654.771 1833.5755
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.1315616 11.96789 3 237 2.5255e-07 ***
## Wilks 1 0.8684384 11.96789 3 237 2.5255e-07 ***
## Hotelling-Lawley 1 0.1514922 11.96789 3 237 2.5255e-07 ***
## Roy 1 0.1514922 11.96789 3 237 2.5255e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This tests whether fire vs. non-fire differ in Sidi-Bel Abbes alone. The results suggest that we reject \(H_0\). In Sidi-Bel Abbes, fire days differ significantly from non-fire days.
# Test 4
linearHypothesis(mod_mva, "Classes1:Region1 = 0")
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 3.905114 -45.08995 7.16896
## RH -45.089947 520.62588 -82.77557
## Ws 7.168960 -82.77557 13.16069
##
## Sum of squares and products for error:
## Temperature RH Ws
## Temperature 2220.3684 -4706.159 -540.3526
## RH -4706.1593 36708.699 1654.7711
## Ws -540.3526 1654.771 1833.5755
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0258281 2.094519 3 237 0.10162
## Wilks 1 0.9741719 2.094519 3 237 0.10162
## Hotelling-Lawley 1 0.0265129 2.094519 3 237 0.10162
## Roy 1 0.0265129 2.094519 3 237 0.10162
None of the test results were significant. This confirms again that there is no significant Classes×Region interaction. In other words, fire vs non-fire is comparable in both regions, and the three basic meteorological measurements could be good universal indicators for fire prediction, regardless of region.
Performing the univariate followup based on the first multivariate contrast (test 1) we have performed:
options(contrasts = c("contr.sum","contr.poly"))
mod.T <- lm(Temperature ~ Classes * Region, data = df)
mod.RH <- lm(RH ~ Classes * Region, data = df)
mod.Ws <- lm(Ws ~ Classes * Region, data = df)
names(coef(mod.T))
## [1] "(Intercept)" "Classes1" "Region1" "Classes1:Region1"
linearHypothesis(mod.T, "Region1 = 0")
##
## Linear hypothesis test:
## Region1 = 0
##
## Model 1: restricted model
## Model 2: Temperature ~ Classes * Region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 240 2325.6
## 2 239 2220.4 1 105.25 11.33 0.0008887 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linearHypothesis(mod.RH, "Region1 = 0")
##
## Linear hypothesis test:
## Region1 = 0
##
## Model 1: restricted model
## Model 2: RH ~ Classes * Region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 240 42172
## 2 239 36709 1 5463.4 35.57 8.792e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linearHypothesis(mod.Ws, "Region1 = 0")
##
## Linear hypothesis test:
## Region1 = 0
##
## Model 1: restricted model
## Model 2: Ws ~ Classes * Region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 240 1896.8
## 2 239 1833.6 1 63.188 8.2364 0.004474 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The univariate confirm that all three responses differ significantly
by region, specially Temperature and RH.
linearHypothesis(mod.T, "Classes1:Region1 = 0")
##
## Linear hypothesis test:
## Classes1:Region1 = 0
##
## Model 1: restricted model
## Model 2: Temperature ~ Classes * Region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 240 2224.3
## 2 239 2220.4 1 3.9051 0.4203 0.5174
linearHypothesis(mod.RH, "Classes1:Region1 = 0")
##
## Linear hypothesis test:
## Classes1:Region1 = 0
##
## Model 1: restricted model
## Model 2: RH ~ Classes * Region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 240 37229
## 2 239 36709 1 520.63 3.3896 0.06685 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linearHypothesis(mod.Ws, "Classes1:Region1 = 0")
##
## Linear hypothesis test:
## Classes1:Region1 = 0
##
## Model 1: restricted model
## Model 2: Ws ~ Classes * Region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 240 1846.7
## 2 239 1833.6 1 13.161 1.7154 0.1915
Zooming in to the univariate breakdown of Region x Classes interaction, none of these univariate p-values drops below 0.05. The slight signal in RH (\(p\approx 0.067\)) matches what we saw before but both temperature and wind show no region‐specific fire effect.
pairs(df[c("Rain","Temperature","RH","Ws")],
pch = 20,
col = as.numeric(df$Classes),
main = "logFWI vs. Responses (color = fire class)",
panel = function(x, y, ...) {
points(x, y, ...)
abline(lm(y ~ x), col = 'grey', lwd = 2)
}
)
In the pair-plots, as Rain increases, average
temperature tends to drop, relative humidity rises, and wind speed
slightly increase). Considering that rainfall data is always
non-negative, and its distribution tend to be naturally highly right
skewed, and there isn’t too much to do to change that (after several
testing), we can loosely accept that the variables are linearly
associated.
mod_mva2 <- manova(
cbind(Temperature, RH, Ws) ~ Classes * Region + Rain,
data = df
)
summary(Anova(mod_mva2, type = 3), univariate = TRUE)
##
## Type III MANOVA Tests:
##
## Sum of squares and products for error:
## Temperature RH Ws
## Temperature 2151.9642 -4566.988 -482.7096
## RH -4566.9875 36425.547 1537.4935
## Ws -482.7096 1537.493 1785.0008
##
## ------------------------------------------
##
## Term: (Intercept)
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 199266.70 387262.8 94573.09
## RH 387262.81 752621.9 183797.08
## Ws 94573.09 183797.1 44884.91
##
## Multivariate Tests: (Intercept)
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.9962 20602.26 3 236 < 2.22e-16 ***
## Wilks 1 0.0038 20602.26 3 236 < 2.22e-16 ***
## Hotelling-Lawley 1 261.8932 20602.26 3 236 < 2.22e-16 ***
## Roy 1 261.8932 20602.26 3 236 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Classes
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 482.25473 -1624.76762 24.658738
## RH -1624.76762 5474.01544 -83.077920
## Ws 24.65874 -83.07792 1.260855
##
## Multivariate Tests: Classes
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.2175733 21.87524 3 236 1.5536e-12 ***
## Wilks 1 0.7824267 21.87524 3 236 1.5536e-12 ***
## Hotelling-Lawley 1 0.2780751 21.87524 3 236 1.5536e-12 ***
## Roy 1 0.2780751 21.87524 3 236 1.5536e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Region
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 108.80216 -774.5074 -84.42426
## RH -774.50743 5513.3259 600.97351
## Ws -84.42426 600.9735 65.50840
##
## Multivariate Tests: Region
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.1421553 13.03602 3 236 6.5918e-08 ***
## Wilks 1 0.8578447 13.03602 3 236 6.5918e-08 ***
## Hotelling-Lawley 1 0.1657121 13.03602 3 236 6.5918e-08 ***
## Roy 1 0.1657121 13.03602 3 236 6.5918e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Rain
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 68.40418 -139.1717 -57.64301
## RH -139.17174 283.1519 117.27760
## Ws -57.64301 117.2776 48.57476
##
## Multivariate Tests: Rain
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0454227 3.743279 3 236 0.011759 *
## Wilks 1 0.9545773 3.743279 3 236 0.011759 *
## Hotelling-Lawley 1 0.0475841 3.743279 3 236 0.011759 *
## Roy 1 0.0475841 3.743279 3 236 0.011759 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Classes:Region
##
## Sum of squares and products for the hypothesis:
## Temperature RH Ws
## Temperature 4.494424 -48.98714 7.432159
## RH -48.987142 533.93721 -81.007096
## Ws 7.432159 -81.00710 12.290115
##
## Multivariate Tests: Classes:Region
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0256567 2.071472 3 236 0.10468
## Wilks 1 0.9743433 2.071472 3 236 0.10468
## Hotelling-Lawley 1 0.0263323 2.071472 3 236 0.10468
## Roy 1 0.0263323 2.071472 3 236 0.10468
##
## Type III Sums of Squares
## df Temperature RH Ws
## (Intercept) 1 1.9927e+05 752621.90 44884.9124
## Classes 1 4.8225e+02 5474.02 1.2609
## Region 1 1.0880e+02 5513.33 65.5084
## Rain 1 6.8404e+01 283.15 48.5748
## Classes:Region 1 4.4944e+00 533.94 12.2901
## residuals 238 2.1520e+03 36425.55 1785.0008
##
## F-tests
## Temperature RH Ws
## (Intercept) 22038.23 4917.54 5984.65
## Classes 53.34 35.77 0.17
## Region 12.03 36.02 8.73
## Rain 7.57 1.85 6.48
## Classes:Region 0.50 3.49 1.64
##
## p-values
## Temperature RH Ws
## (Intercept) < 2.22e-16 < 2.22e-16 < 2.22e-16
## Classes 4.2068e-12 8.0953e-09 0.68216229
## Region 0.00062022 7.2193e-09 0.00343621
## Rain 0.00640764 0.17506110 0.01156286
## Classes:Region 0.48148058 0.06301886 0.20175279
summary.aov(mod_mva2)
## Response Temperature :
## Df Sum Sq Mean Sq F value Pr(>F)
## Classes 1 848.17 848.17 93.8052 < 2.2e-16 ***
## Region 1 112.92 112.92 12.4885 0.0004918 ***
## Rain 1 67.81 67.81 7.5001 0.0066362 **
## Classes:Region 1 4.49 4.49 0.4971 0.4814806
## Residuals 238 2151.96 9.04
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response RH :
## Df Sum Sq Mean Sq F value Pr(>F)
## Classes 1 9938 9937.6 64.9309 3.750e-14 ***
## Region 1 6043 6042.7 39.4822 1.563e-09 ***
## Rain 1 270 269.8 1.7631 0.18551
## Classes:Region 1 534 533.9 3.4887 0.06302 .
## Residuals 238 36426 153.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response Ws :
## Df Sum Sq Mean Sq F value Pr(>F)
## Classes 1 9.36 9.363 1.2484 0.264993
## Region 1 56.64 56.642 7.5522 0.006453 **
## Rain 1 49.45 49.445 6.5927 0.010852 *
## Classes:Region 1 12.29 12.290 1.6387 0.201753
## Residuals 238 1785.00 7.500
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multivariate:
After adjusting for Rain, Classes and
Region remain highly significant jointly on
(Temperature, RH, Ws). Rain also shows a small
but significant multivariate effect (\(p
\approx 0.012\)), i.e. adding it to the model explains extra
variation across T, RH and Ws. The Classes × Region interaction stays
non-significant.
Univariate:
Temperature ~ Rain:
Fire days remain much hotter than non-fire days, S-B Abbes remains cooler than Bejaia, and each extra millimeter of rain cools the day significantly.
RH ~ Rain:
Fires are still drier and S-B Abbes more humid, but rainfall itself
no longer shows a clear univariate effect on RH. This suggests that the
big humidity contrasts are driven by Classes and
Region rather than by rainfall.
Ws ~ Rain:
Wind still does not differ by fire class, but remains higher in S-B Abbes and now also increases modestly with rainfall.
Also note that after we account for rain, the Classes F–stat for Temperature drops from about 78.9 (mod_mva) to 53.3 (mod_mva2), and for RH from about 48.97 to 35.8. This shrinking means part of what we originally attributed to fire effects was actually just days with lower rainfall.
In contrast, the Region F–statistics stay right around 11–12 for Temperature and 35–36 for RH, so the climate difference between Bejaia and Sidi‐Bel Abbes isn’t explained away by rainfall.
mod_mva2 <- manova(cbind(Temperature, RH, Ws) ~ Classes*Region + Rain, data = df)
cqplot(mod_mva2$residuals, label = "MANOVA residuals")
The chi‐square qq plot seems to show reasonably multivariately normal distribution of the residuals. Still, considering the handful of points lying above the upper band, we could try to look at the residuals vs. fitted plots for our dependent variables and perform a box-cox transformation.
mod.T2 <- lm(Temperature ~ Classes * Region + Rain, data = df)
plot(mod.T2, which = c(1,2), pch = 19, col = 'blue')
mod.RH2 <- lm(RH ~ Classes * Region + Rain, data = df)
plot(mod.RH2, which = c(1,2), pch = 19, col = 'blue')
mod.Ws2 <- lm(Ws ~ Classes * Region + Rain, data = df)
plot(mod.Ws2, which = c(1,2), pch = 19, col = 'blue')
There is some heteroskedasticity and uneven spread in the Residuals vs. Fitted plot for all there variabels.
bcT <- powerTransform(mod.T2)
(lamT <- bcT$x[which.max(bcT$y)])
## [1] 1
bcH <- powerTransform(mod.RH2)
(lamH <- bcH$x[which.max(bcH$y)])
## [1] 1
bcW <- powerTransform(mod.Ws2)
(lamW <- bcW$x[which.max(bcW$y)])
## [1] 1
Interestingly, for each margin, the box–cox log-likelihood is maximized at \(\lambda = 1\), i.e. no power transformation. This probably means that the residuals are heavy tailed and aren’t skewed in a way a simple power could fix.
Moreover, because Pillai’s trace is fairly robust to mild normality violations, the MANOVA conclusions should still remain strong.
(mrpp1 <- mrpp(df[,c("Temperature","RH","Ws")], df$Classes))
##
## Call:
## mrpp(dat = df[, c("Temperature", "RH", "Ws")], grouping = df$Classes)
##
## Dissimilarity index: euclidean
## Weights for groups: n
##
## Class means and counts:
##
## fire not fire
## delta 17.42 16
## n 137 106
##
## Chance corrected within-group agreement A: 0.09042
## Based on observed delta 16.8 and expected delta 18.47
##
## Significance of delta: 0.001
## Permutation: free
## Number of permutations: 999
The average within‐group distance in the 3d space of
Temperature, RH and Ws was 16.8,
lower than the expected value of 18.47 if fire and non-fire days were
drawn from the same distribution. The resulting of A is ~0.09 and
permutation p-value is 0.001, meaning fire status explains roughly 9% of
the total variation (pretty weak) and that the clustering of days by
fire vs. non-fire is highly unlikely by chance.
Meteorological conditions differ meaningfully between fire and non-fire days. Patterns are consistent across regions and months, with temperature and humidity as the strongest indicators. Wind speed seems to show inconsistent associations and contributs less to the multivariate separation. Rainfall adds modest explanatory power but doesn’t override the main effects. Lastly, the overall clustering of fire vs. non-fire days, while statistically significant, was relatively weak.
Given these limitations, I thought MANOVA did not yield particularly novel insights for this dataset. To further explore MANOVA, I repeated this procedure on the loaner datasets, hoping to see stornger group separation.
crime <- read_csv("ohiocrimehm.csv")
## Rows: 559 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (12): V10, V12, V16, V23, V64, V67, V70, V71, V72, V73, V86, V87
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Our four dependent variables are:
And 2 predictors (plus their interaction term):
responses <- c("V10","V12","V16","V23")
names(responses) <- c(
"GovtJobs",
"RecreationPrograms",
"DrugTreatment",
"FamilyHelp"
)
crime$Gender <- factor(crime$V70, levels=c(0,1), labels=c("Female","Male"))
crime$Race <- factor(crime$V71, levels=c(1,2,3),
labels=c("White","Black","Other"))
crime$Education <- factor(crime$V72, levels=1:8,
labels=c("noHS", "someHS", "HSGrad",
"College1yr", "College2yr", "College3yr",
"CollegeGrad","GradDegree"))
for (v in responses) {
interaction.plot(
x.factor = crime$Education,
trace.factor = crime$Gender,
response = crime[[v]],
type = "b", lwd = 2,
lty = 1, col = c(2, 1, 4), pch = 16:18,
trace.label = "Race",
xlab = '',
ylab = paste("Mean", v),
main = paste("Interaction Plot for", v),
cex.axis = 0.6
)
grid()
}
For V10, females generally exhibit higher mean scores than males, with both genders following relatively parallel trends across education levels, indicating limited interaction.
In both V12 and V16, the patterns are more irregular, though females still tend to report higher mean scores. Male responses show less variability and consistently lower values, suggesting potential gender differences.
In V23, data for females seem to be limited, but among males, there is strong support for family involvement among those with no high school education. However, this support declines among those with 2–3 years of college education.
Overall, the presence of intersecting lines in these plots suggests interaction effects between gender and education. However, there seem to be a consistent dip in scores at the College3yr level across variables and genders. This may reflect not just genuine attitude differences but also survey-related reasons, for example interpretation ambiguity or response tendencies within this education subgroup.
options(contrasts = c("contr.sum","contr.poly"))
crime_mva <- lm(
cbind(V10, V12, V16, V23) ~ Education *Gender, data = crime
)
summary(Anova(crime_mva, type = 3), univariate = TRUE)
##
## Type III MANOVA Tests:
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## ------------------------------------------
##
## Term: (Intercept)
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 5626.638 5070.704 4757.690 5486.543
## V12 5070.704 4569.699 4287.612 4944.452
## V16 4757.690 4287.612 4022.938 4639.231
## V23 5486.543 4944.452 4639.231 5349.937
##
## Multivariate Tests: (Intercept)
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.91252 1374.308 4 527 < 2.22e-16 ***
## Wilks 1 0.08748 1374.308 4 527 < 2.22e-16 ***
## Hotelling-Lawley 1 10.43118 1374.308 4 527 < 2.22e-16 ***
## Roy 1 10.43118 1374.308 4 527 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Education
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 29.474873 13.3177167 21.5957023 5.926393
## V12 13.317717 21.1965724 0.2303029 4.988018
## V16 21.595702 0.2303029 49.9892539 2.081062
## V23 5.926393 4.9880180 2.0810618 3.231626
##
## Multivariate Tests: Education
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 7 0.0901985 1.746717 28 2120.000 0.0091167 **
## Wilks 7 0.9120955 1.755370 28 1901.548 0.0086513 **
## Hotelling-Lawley 7 0.0938807 1.761939 28 2102.000 0.0082081 **
## Roy 7 0.0550295 4.166522 7 530.000 0.0001766 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Gender
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 19.20871 17.31075 18.12485 11.174032
## V12 17.31075 15.60032 16.33399 10.069958
## V16 18.12485 16.33399 17.10215 10.543535
## V23 11.17403 10.06996 10.54353 6.500125
##
## Multivariate Tests: Gender
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0289081 3.922021 4 527 0.0037885 **
## Wilks 1 0.9710919 3.922021 4 527 0.0037885 **
## Hotelling-Lawley 1 0.0297687 3.922021 4 527 0.0037885 **
## Roy 1 0.0297687 3.922021 4 527 0.0037885 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Education:Gender
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 14.4746321 11.53835665 3.27713852 0.6236801
## V12 11.5383567 24.41095765 0.05604968 0.0196499
## V16 3.2771385 0.05604968 14.34669471 -4.9049821
## V23 0.6236801 0.01964990 -4.90498206 6.0417494
##
## Multivariate Tests: Education:Gender
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 7 0.0593362 1.140062 28 2120.000 0.279521
## Wilks 7 0.9418382 1.138088 28 1901.548 0.281986
## Hotelling-Lawley 7 0.0605158 1.135752 28 2102.000 0.284486
## Roy 7 0.0253478 1.919188 7 530.000 0.064482 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Type III Sums of Squares
## df V10 V12 V16 V23
## (Intercept) 1 5626.638 4569.699 4022.938 5349.9372
## Education 7 29.475 21.197 49.989 3.2316
## Gender 1 19.209 15.600 17.102 6.5001
## Education:Gender 7 14.475 24.411 14.347 6.0417
## residuals 530 845.269 1418.549 1158.744 804.9742
##
## F-tests
## V10 V12 V16 V23
## (Intercept) 3528.01 243.91 1840.06 503.20
## Education 18.48 1.13 22.86 0.30
## Gender 12.04 0.83 7.82 0.61
## Education:Gender 9.08 1.30 6.56 0.57
##
## p-values
## V10 V12 V16 V23
## (Intercept) < 2.22e-16 < 2.22e-16 < 2.22e-16 < 2.22e-16
## Education 2.0424e-05 0.34168434 2.2554e-06 0.95208548
## Gender 0.00056188 0.56043493 0.00534780 0.74671260
## Education:Gender 0.00271376 0.24658261 0.01069308 0.78187967
Interpretation:
Both Education and Gender show significant
multivariate effect (p < 0.01), confirming that attitudes differ
individually among different education levels and the two gender.
However, the Education × Gender interaction is not significant (p > 0.40), i.e. no evidence that the gender gap changes across education levels. As with the forest fire data, in this dataset, Roy’s test again comes close (\(p \approx 0.064\)), corresponding to the univariate results where Education × Gender is significant for V10 and V16, but not much from V12 or V23.
Other univariate followups suggest that for V10 and V12, Education, Gender, and their interactions are highly significant. For V12 and V23, however, no significant effects were found, i.e. consistant responses across groups.
cqplot(crime_mva$residuals, label = "Residuals MANOVA")
All data points seem to lie within the bounds, so the residuals are multivariately normally distributed, so assumptions of the model are met.
options(contrasts = c("contr.treatment", "contr.poly"))
contrasts(crime$Gender)
## Male
## Female 0
## Male 1
contrasts(crime$Education)
## someHS HSGrad College1yr College2yr College3yr CollegeGrad
## noHS 0 0 0 0 0 0
## someHS 1 0 0 0 0 0
## HSGrad 0 1 0 0 0 0
## College1yr 0 0 1 0 0 0
## College2yr 0 0 0 1 0 0
## College3yr 0 0 0 0 1 0
## CollegeGrad 0 0 0 0 0 1
## GradDegree 0 0 0 0 0 0
## GradDegree
## noHS 0
## someHS 0
## HSGrad 0
## College1yr 0
## College2yr 0
## College3yr 0
## CollegeGrad 0
## GradDegree 1
rownames(coef(crime_mva))
## [1] "(Intercept)" "Education1" "Education2"
## [4] "Education3" "Education4" "Education5"
## [7] "Education6" "Education7" "Gender1"
## [10] "Education1:Gender1" "Education2:Gender1" "Education3:Gender1"
## [13] "Education4:Gender1" "Education5:Gender1" "Education6:Gender1"
## [16] "Education7:Gender1"
linearHypothesis(crime_mva, "Gender1 = 0")
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 19.20871 17.31075 18.12485 11.174032
## V12 17.31075 15.60032 16.33399 10.069958
## V16 18.12485 16.33399 17.10215 10.543535
## V23 11.17403 10.06996 10.54353 6.500125
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0289081 3.922021 4 527 0.0037885 **
## Wilks 1 0.9710919 3.922021 4 527 0.0037885 **
## Hotelling-Lawley 1 0.0297687 3.922021 4 527 0.0037885 **
## Roy 1 0.0297687 3.922021 4 527 0.0037885 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This tests overall gender effect. The results shows that there is a significant multivariate gender gap.
linearHypothesis(crime_mva, "Education2 = 0")
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 6.1161708 -2.3670829 8.897320 -0.8829047
## V12 -2.3670829 0.9161094 -3.443445 0.3417021
## V16 8.8973204 -3.4434446 12.943116 -1.2843798
## V23 -0.8829047 0.3417021 -1.284380 0.1274524
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0222346 2.99602 4 527 0.018327 *
## Wilks 1 0.9777654 2.99602 4 527 0.018327 *
## Hotelling-Lawley 1 0.0227402 2.99602 4 527 0.018327 *
## Roy 1 0.0227402 2.99602 4 527 0.018327 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The result shows that high‐school graduation vs. no high‐school shift the joint profile of attitudes in a statistically significant way.
linearHypothesis(crime_mva, "Education7 - Education3 = 0")
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 4.770419 4.0207009 -5.740616 1.1859195
## V12 4.020701 3.3888080 -4.838422 0.9995405
## V16 -5.740616 -4.8384216 6.908129 -1.4271090
## V23 1.185919 0.9995405 -1.427109 0.2948179
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0183233 2.459153 4 527 0.044589 *
## Wilks 1 0.9816767 2.459153 4 527 0.044589 *
## Hotelling-Lawley 1 0.0186653 2.459153 4 527 0.044589 *
## Roy 1 0.0186653 2.459153 4 527 0.044589 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is a small but statistically significant shift in the combined attitude vector when going from SomeCollege to GradDegree.
linearHypothesis(crime_mva, "Gender1 + Education7:Gender1 = 0")
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 2.1946681 1.482640 0.7030148 5.020247
## V12 1.4826402 1.001619 0.4749320 3.391501
## V16 0.7030148 0.474932 0.2251957 1.608128
## V23 5.0202469 3.391501 1.6081284 11.483686
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0148352 1.983972 4 527 0.095629 .
## Wilks 1 0.9851648 1.983972 4 527 0.095629 .
## Hotelling-Lawley 1 0.0150586 1.983972 4 527 0.095629 .
## Roy 1 0.0150586 1.983972 4 527 0.095629 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This test looks at the gender gap specifically at the top education level. It turns out that the male–female difference among those with a graduate degree is not significant at alpha = .05 despite a weak trend (\(p\approx .096\)).
linearHypothesis(crime_mva, "Education2:Gender1 = 0")
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 0.77089767 0.62552971 0.62427820 -0.057496661
## V12 0.62552971 0.50757374 0.50655823 -0.046654531
## V16 0.62427820 0.50655823 0.50554475 -0.046561189
## V23 -0.05749666 -0.04665453 -0.04656119 0.004288333
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0013739 0.1812589 4 527 0.9481
## Wilks 1 0.9986261 0.1812589 4 527 0.9481
## Hotelling-Lawley 1 0.0013758 0.1812589 4 527 0.9481
## Roy 1 0.0013758 0.1812589 4 527 0.9481
Based on the resulting p-values, there is no evidence that the male–female gap at HSGrad differs from the male–female gap at noHS.
linearHypothesis(crime_mva, "Education6:Gender1 - Education3:Gender1 = 0")
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 2.2400933 -1.2663431 -1.4528299 0.46559125
## V12 -1.2663431 0.7158742 0.8212967 -0.26320256
## V16 -1.4528299 0.8212967 0.9422441 -0.30196282
## V23 0.4655913 -0.2632026 -0.3019628 0.09677062
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.0069285 0.9191999 4 527 0.45236
## Wilks 1 0.9930715 0.9191999 4 527 0.45236
## Hotelling-Lawley 1 0.0069768 0.9191999 4 527 0.45236
## Roy 1 0.0069768 0.9191999 4 527 0.45236
The gender gap at CollegeGrad is not significantly different from the gender gap at HSGrad.
Overall interaction test:
linearHypothesis(crime_mva, c(
"Education1:Gender1 = 0",
"Education2:Gender1 = 0",
"Education3:Gender1 = 0",
"Education4:Gender1 = 0",
"Education5:Gender1 = 0",
"Education6:Gender1 = 0",
"Education7:Gender1 = 0"
))
##
## Sum of squares and products for the hypothesis:
## V10 V12 V16 V23
## V10 14.4746321 11.53835665 3.27713852 0.6236801
## V12 11.5383567 24.41095765 0.05604968 0.0196499
## V16 3.2771385 0.05604968 14.34669471 -4.9049821
## V23 0.6236801 0.01964990 -4.90498206 6.0417494
##
## Sum of squares and products for error:
## V10 V12 V16 V23
## V10 845.2686 484.0144 332.7784 304.6463
## V12 484.0144 1418.5489 354.8061 383.3887
## V16 332.7784 354.8061 1158.7445 241.4170
## V23 304.6463 383.3887 241.4170 804.9742
##
## Multivariate Tests:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 7 0.0593362 1.140062 28 2120.000 0.279521
## Wilks 7 0.9418382 1.138088 28 1901.548 0.281986
## Hotelling-Lawley 7 0.0605158 1.135752 28 2102.000 0.284486
## Roy 7 0.0253478 1.919188 7 530.000 0.064482 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Jointly, there is no significant interaction.
Putting everything together, Gender has a significant main effect across all four response variables. Big gaps in Education (HSGrad vs noHS) produces significantly different responses, smaller educaitonal gaps (someHS vs noHS) produces little difference.
Overall, support for crime‐prevention measures changes with gender and to some extent with completing high school, but the gender gap itself does not depend on a person’s education level.
crime$Income <- crime$V87
options(contrasts = c("contr.treatment","contr.poly"))
j <- function(x) jitter(x, factor = 1)
panel_jitter <- function(x, y, ...) {
xi <- j(x); yi <- j(y)
points(xi, yi, pch = 16, col = rgb(0, 0.5, 1, 0.4), ...)
ab <- coef(lm(y ~ x))
abline(ab, col = 2, lwd = 2)
}
pairs(
crime[c("Income","V10","V12","V16","V23")],
panel = panel_jitter,
main = "Income vs. Attitudes (jittered, with fit lines)",
labels = c("Income","V10","V12","V16","V23")
)
Because of the categorical nature of the data, it is hard to tell if the variables are linearly associated.
crime_mod2 <- lm(
cbind(V10, V12, V16, V23) ~ Gender * Education + Income, data = crime
)
summary(crime_mod2, test = "Pillai")
## Response V10 :
##
## Call:
## lm(formula = V10 ~ Gender * Education + Income, data = crime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.6623 -0.4831 0.2068 0.8031 2.4239
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.58615 0.63245 8.832 <2e-16 ***
## GenderMale -0.51728 0.72221 -0.716 0.4742
## EducationsomeHS -0.04962 0.77375 -0.064 0.9489
## EducationHSGrad -1.10276 0.65439 -1.685 0.0926 .
## EducationCollege1yr -0.53563 0.67204 -0.797 0.4258
## EducationCollege2yr -0.44732 0.71090 -0.629 0.5295
## EducationCollege3yr -1.01310 0.78157 -1.296 0.1955
## EducationCollegeGrad -0.47306 0.69675 -0.679 0.4975
## EducationGradDegree -0.28791 0.71830 -0.401 0.6887
## Income -0.08615 0.04232 -2.036 0.0423 *
## GenderMale:EducationsomeHS -0.18467 0.88021 -0.210 0.8339
## GenderMale:EducationHSGrad 0.65503 0.74851 0.875 0.3819
## GenderMale:EducationCollege1yr 0.07535 0.77755 0.097 0.9228
## GenderMale:EducationCollege2yr -0.28124 0.82296 -0.342 0.7327
## GenderMale:EducationCollege3yr -0.04892 0.90203 -0.054 0.9568
## GenderMale:EducationCollegeGrad 0.31801 0.79013 0.402 0.6875
## GenderMale:EducationGradDegree -0.06720 0.80645 -0.083 0.9336
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.262 on 503 degrees of freedom
## (39 observations deleted due to missingness)
## Multiple R-squared: 0.06814, Adjusted R-squared: 0.0385
## F-statistic: 2.299 on 16 and 503 DF, p-value: 0.002905
##
##
## Response V12 :
##
## Call:
## lm(formula = V12 ~ Gender * Education + Income, data = crime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0888 -1.5092 0.3668 1.1877 2.6241
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.33356 0.81728 5.302 1.71e-07 ***
## GenderMale -0.03831 0.93327 -0.041 0.967
## EducationsomeHS -0.05189 0.99986 -0.052 0.959
## EducationHSGrad -0.10345 0.84562 -0.122 0.903
## EducationCollege1yr 0.25543 0.86843 0.294 0.769
## EducationCollege2yr 0.92234 0.91865 1.004 0.316
## EducationCollege3yr -0.39521 1.00997 -0.391 0.696
## EducationCollegeGrad 0.29422 0.90037 0.327 0.744
## EducationGradDegree 1.14134 0.92821 1.230 0.219
## Income -0.08356 0.05468 -1.528 0.127
## GenderMale:EducationsomeHS -0.44970 1.13744 -0.395 0.693
## GenderMale:EducationHSGrad -0.08884 0.96726 -0.092 0.927
## GenderMale:EducationCollege1yr -0.12948 1.00478 -0.129 0.898
## GenderMale:EducationCollege2yr -1.24358 1.06345 -1.169 0.243
## GenderMale:EducationCollege3yr 0.21365 1.16563 0.183 0.855
## GenderMale:EducationCollegeGrad 0.01932 1.02104 0.019 0.985
## GenderMale:EducationGradDegree -1.11314 1.04212 -1.068 0.286
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.631 on 503 degrees of freedom
## (39 observations deleted due to missingness)
## Multiple R-squared: 0.04708, Adjusted R-squared: 0.01676
## F-statistic: 1.553 on 16 and 503 DF, p-value: 0.07739
##
##
## Response V16 :
##
## Call:
## lm(formula = V16 ~ Gender * Education + Income, data = crime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8759 -1.3336 0.3349 1.1064 2.9746
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.64387 0.73304 7.699 7.31e-14 ***
## GenderMale -1.55425 0.83707 -1.857 0.06393 .
## EducationsomeHS -0.62412 0.89680 -0.696 0.48679
## EducationHSGrad -1.17076 0.75846 -1.544 0.12331
## EducationCollege1yr -0.95283 0.77891 -1.223 0.22180
## EducationCollege2yr -0.82828 0.82396 -1.005 0.31526
## EducationCollege3yr -2.22937 0.90587 -2.461 0.01419 *
## EducationCollegeGrad -1.56548 0.80756 -1.939 0.05312 .
## EducationGradDegree -1.47893 0.83254 -1.776 0.07627 .
## Income -0.14387 0.04905 -2.933 0.00351 **
## GenderMale:EducationsomeHS 0.96065 1.02020 0.942 0.34684
## GenderMale:EducationHSGrad 1.32174 0.86756 1.524 0.12826
## GenderMale:EducationCollege1yr 1.06378 0.90121 1.180 0.23840
## GenderMale:EducationCollege2yr 0.81875 0.95384 0.858 0.39109
## GenderMale:EducationCollege3yr 1.74060 1.04549 1.665 0.09656 .
## GenderMale:EducationCollegeGrad 1.67262 0.91580 1.826 0.06838 .
## GenderMale:EducationGradDegree 1.73128 0.93471 1.852 0.06458 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.463 on 503 degrees of freedom
## (39 observations deleted due to missingness)
## Multiple R-squared: 0.07387, Adjusted R-squared: 0.04442
## F-statistic: 2.508 on 16 and 503 DF, p-value: 0.00104
##
##
## Response V23 :
##
## Call:
## lm(formula = V23 ~ Gender * Education + Income, data = crime)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.7173 -0.5513 0.1075 0.8161 2.0245
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.56689 0.61075 7.477 3.39e-13 ***
## GenderMale 0.39246 0.69743 0.563 0.574
## EducationsomeHS 0.05853 0.74720 0.078 0.938
## EducationHSGrad 0.21732 0.63193 0.344 0.731
## EducationCollege1yr 0.49230 0.64898 0.759 0.448
## EducationCollege2yr 0.20046 0.68651 0.292 0.770
## EducationCollege3yr 0.18395 0.75475 0.244 0.808
## EducationCollegeGrad 0.62120 0.67284 0.923 0.356
## EducationGradDegree 0.40072 0.69365 0.578 0.564
## Income -0.06689 0.04086 -1.637 0.102
## GenderMale:EducationsomeHS -0.57595 0.85001 -0.678 0.498
## GenderMale:EducationHSGrad -0.65830 0.72283 -0.911 0.363
## GenderMale:EducationCollege1yr -0.84786 0.75087 -1.129 0.259
## GenderMale:EducationCollege2yr -0.78045 0.79472 -0.982 0.327
## GenderMale:EducationCollege3yr -0.80511 0.87108 -0.924 0.356
## GenderMale:EducationCollegeGrad -1.20366 0.76302 -1.577 0.115
## GenderMale:EducationGradDegree -0.61052 0.77878 -0.784 0.433
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.219 on 503 degrees of freedom
## (39 observations deleted due to missingness)
## Multiple R-squared: 0.04712, Adjusted R-squared: 0.01681
## F-statistic: 1.555 on 16 and 503 DF, p-value: 0.07688
summary.aov(crime_mod2)
## Response V10 :
## Df Sum Sq Mean Sq F value Pr(>F)
## Gender 1 9.88 9.8843 6.2055 0.01306 *
## Education 7 28.76 4.1088 2.5795 0.01277 *
## Income 1 6.71 6.7061 4.2102 0.04070 *
## Gender:Education 7 13.23 1.8904 1.1868 0.30851
## Residuals 503 801.19 1.5928
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response V12 :
## Df Sum Sq Mean Sq F value Pr(>F)
## Gender 1 17.21 17.2053 6.4686 0.01128 *
## Education 7 20.11 2.8735 1.0803 0.37458
## Income 1 5.82 5.8172 2.1871 0.13980
## Gender:Education 7 22.96 3.2796 1.2330 0.28265
## Residuals 503 1337.89 2.6598
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response V16 :
## Df Sum Sq Mean Sq F value Pr(>F)
## Gender 1 16.75 16.7497 7.8278 0.005342 **
## Education 7 36.37 5.1961 2.4283 0.018731 *
## Income 1 17.94 17.9351 8.3818 0.003955 **
## Gender:Education 7 14.80 2.1138 0.9879 0.439156
## Residuals 503 1076.30 2.1398
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response V23 :
## Df Sum Sq Mean Sq F value Pr(>F)
## Gender 1 19.49 19.4926 13.1228 0.0003213 ***
## Education 7 7.15 1.0219 0.6879 0.6823514
## Income 1 4.45 4.4515 2.9968 0.0840399 .
## Gender:Education 7 5.85 0.8363 0.5630 0.7860845
## Residuals 503 747.15 1.4854
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## 39 observations deleted due to missingness
For V10 and V16, we see significant negative association with income. Effects from Education and Gender still remain significant, but interaction term is no longer significant.
V12 is very stable and no predictors are statistically significant.
V23 shows significant difference in responses across genders.
Overall, adding the income term doesn’t change the overall pattern of significance in the multivariate tests, but it explains additional variation, especially in V10 and V16.
crime_mva2 <- manova(
cbind(V10, V12, V16, V23) ~ Gender * Education + Income, data = crime)
cqplot(crime_mva2$residuals, label = "MANOVA residuals")
Model assumptions are well-met.
mod_gender <- manova(
cbind(V10, V12, V16, V23) ~ Gender,
data = crime
)
summary(mod_gender, test = "Pillai")
## Df Pillai approx F num Df den Df Pr(>F)
## Gender 1 0.030128 4.2014 4 541 0.002332 **
## Residuals 544
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
aov_gender <- summary.aov(mod_gender)
p_gender <- sapply(
aov_gender,
function(x) x["Gender", "Pr(>F)"]
)
p_adj_gender <- p.adjust(p_gender, method = "holm")
gender_contrasts <- data.frame(
P_value = p_gender,
Holm_Adj_P = p_adj_gender
)
print(gender_contrasts)
## P_value Holm_Adj_P
## Response V10 0.011028302 0.015269863
## Response V12 0.005089954 0.015269863
## Response V16 0.005303752 0.015269863
## Response V23 0.001343217 0.005372868
Even after adjusting for multiple comparisons, all four items show a statistically significant gender difference. In each case, the mean support among women differs from men, so gender is a robust predictor. The smallest adjusted p is for V23 (“FamilyHelp”), indicating the strongest gender gap there.